One thing we wanted to see was what this would look like in a 3D plot. Here we will use R because of its rgl package, which allows for fast 3D plotting. But first we have to use Principal Component Analysis (PCA) to reduce the number of dimensions.

library(bpca)
library(rgl)
library(ggfortify)
library(ggplot2)
library(RColorBrewer)
library(knitr)
library(rglwidget)
knit_hooks$set(webgl = hook_webgl)

setwd("/Users/bobminnich/Documents/Columbia/Courses/Data_Mining/Examples/DigitReader")
data = as.data.frame(read.csv("train.csv", header = TRUE, sep = ","))
labels2 = data[,1]
labels_frame <- as.data.frame(data[,1])
labels_frame <- setNames(labels_frame, c("l"))
data_r = data[,-1]
pca = prcomp(data_r)
plot.new()
screeplot(pca, main = "PCA Plot of NIST data", type = "lines")

We can see from the scree plot that 3 principal components may not capture enough of the variance to stand in for the full set of dimensions, but we will continue to see if we can understand anything visually from the plot.
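
To put a number on what the scree plot shows, one option is to compute the share of total variance captured by the first few components. The sketch below is not part of the original analysis: `variance_explained` is a hypothetical helper, and the random `toy` matrix is only a stand-in so the example runs without train.csv.

```r
# Hypothetical helper (not in the original analysis): cumulative share of
# variance captured by the first k principal components of a prcomp fit.
variance_explained <- function(fit, k = 3) {
  var_per_pc <- fit$sdev^2                 # variance of each component
  sum(var_per_pc[seq_len(k)]) / sum(var_per_pc)
}

# Stand-in data (an assumption for illustration): 200 rows, 20 columns.
set.seed(1)
toy <- matrix(rnorm(200 * 20), nrow = 200)
toy_pca <- prcomp(toy)

# Share of variance kept when only 3 components are retained.
share <- variance_explained(toy_pca, k = 3)
print(share)
```

With the digit data, the same call would be `variance_explained(pca, k = 3)`. A value well below 1 would back up the reading of the scree plot: the 3D view throws away much of the variation in the data.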

dev.off()
## null device 
##           1
colorpal = c("#E41A1C", "#0066ff", "#4DAF4A", "#984EA3", "#FF7F00", "#FFFF33", "#A65628","#ff37cb","#66ff33", "#00ffff")

#Find colors associated with labels and apply the color palette
for(i in 1:10){
  labels_frame$color[data$label == i-1] = colorpal[i]
}
#Used {r testgl, webgl=TRUE, } for R Chunk
#Plotting
plot3d(pca$x[,1:3],col = labels_frame$color, size = 1)
[Interactive 3D plot. Legend: scroll to zoom, click and drag to rotate. Points are colored by digit, Num 0 through Num 9.]


We end up with a very cool looking plot that allows us to rotate and zoom in on specific areas. One thing that is quite noticeable is that the 1s are somewhat out on their own. This makes sense because, out of all of the digits, the 1 is probably the most unique.